Patent abstract:
STREAMING OF ENCODED VIDEO DATA. A source device can signal characteristics of a media presentation description (MPD) file, such that a destination device can select one of a number of presentations corresponding to the MPD file and retrieve one or more video files from the selected presentation. In one example, an apparatus for transporting encoded video data includes a management unit configured to receive encoded video data comprising a number of video segments and to form presentations comprising a number of video files, each of the video files corresponding to a respective one of the video segments, and a network interface configured to, in response to a request specifying a temporal section of the video data, output at least one of the video files corresponding to the number of video segments of the requested temporal section. A client can request temporally sequential fragments from different ones of the presentations.
Publication number: BR112012009832B1
Application number: R112012009832-2
Filing date: 2010-10-27
Publication date: 2021-08-31
Inventors: Ying Chen; Marta Karczewicz
Applicant: Qualcomm Incorporated;
IPC main classification:
Patent description:

Field of Invention
[0001] This description relates to transporting encoded video data.
Description of Prior Art
[0002] Digital video capabilities can be incorporated into a wide variety of devices, including digital televisions, digital broadcast systems, wireless broadcast systems, personal digital assistants (PDAs), desktop or laptop computers, digital cameras, digital video recording devices, digital media players, video game devices, video game consoles, cellular or satellite radio telephones, video teleconferencing devices, and so on. Digital video devices implement video compression techniques, such as those described in the standards defined by MPEG-2, MPEG-4, ITU-T H.263 or ITU-T H.264/MPEG-4 Part 10, Advanced Video Coding (AVC), and extensions of such standards, to transmit and receive digital video information more efficiently.
[0003] Video compression techniques perform spatial prediction and/or temporal prediction to reduce or remove inherent redundancies in video sequences. For block-based video encoding, a video frame or slice can be partitioned into macroblocks. Each macroblock can be further partitioned. Macroblocks in an intracoded (I) frame or slice are encoded, using spatial prediction, with respect to neighboring macroblocks. Macroblocks in an intercoded frame or slice (P or B) may use spatial prediction, in relation to neighboring macroblocks in the same frame or slice, or temporal prediction, in relation to other reference frames.
[0004] After the video data is encoded, the video data can be bundled by a multiplexer for transmission or storage. The MPEG-2 standard includes, for example, a "Systems" section, which defines a transport level for many video encoding standards. MPEG-2 transport level systems can be used by MPEG-2 video encoders or by other video encoders conforming to different video encoding standards. For example, the MPEG-4 standard prescribes different encoding and decoding methodologies than MPEG-2, but video encoders that implement the techniques of the MPEG-4 standard can still use MPEG-2 transport level methodologies. The Third Generation Partnership Project (3GPP) also provides techniques for transporting encoded video data using a particular multimedia container format for the encoded video data.
Invention Summary
[0005] In general, this description describes techniques to support streaming transport of encoded video data through a network protocol, such as, for example, hypertext transfer protocol (HTTP). A source device can form a media presentation description (MPD) file that lists multiple presentations of encoded media data. Each presentation corresponds to a different encoding of a common video. For example, each presentation may place different expectations on a target device, in terms of encoding and/or processing capabilities, and may have a different average bit rate.
[0006] The source device can flag the characteristics of each presentation, allowing a target device to select one of the presentations based on the target device's decoding and rendering capabilities, and to switch between different presentations based on variations in the network environment and in the bandwidths of the presentations. Presentations can be pre-encoded or encoded in real time and stored on a server as file(s) or file fragments compatible with, for example, the ISO base media file format and its extensions. The target device can retrieve data from one or more of the presentations, at various times, over, for example, HTTP. The source device can additionally flag fragments of each presentation, such as byte ranges and corresponding temporal locations of video fragments within each presentation, so that target devices can retrieve individual video fragments from the multiple presentations based, for example, on HTTP requests.
[0007] In one example, a method for transporting encoded video data includes receiving, by a source video device, encoded video data comprising a number of video segments, forming a presentation comprising a number of video files, each of the video files corresponding to a respective one of the video segments, and, in response to a request specifying a temporal section of the video data, outputting at least one of the video files corresponding to the number of video segments of the requested temporal section.
[0008] In another example, an apparatus for transporting encoded video data includes a management unit configured to receive encoded video data comprising a number of video segments and form a presentation comprising a number of video files, each of the video files corresponding to a respective one of the video segments, and a network interface configured to, in response to a request specifying a temporal section of the video data, output at least one of the video files corresponding to the number of video segments of the requested temporal section.
[0009] In another example, an apparatus, for transporting encoded video data, includes mechanisms for receiving encoded video data, comprising a number of video segments, mechanisms for forming a presentation comprising a number of video files, each among the video files corresponding to a respective one among the video segments, and mechanisms for outputting, in response to a request specifying a temporal section of the video data, at least one among the video files corresponding to the number of video segments of the requested temporal section.
[0010] In another example, a computer-readable storage medium comprises instructions that, when executed, cause a processor of a source device for transporting encoded video data to receive encoded video data comprising a number of video segments, form a presentation comprising a number of video files, each of the video files corresponding to a respective one of the video segments, and, in response to a request that specifies a temporal section of the video data, output at least one of the video files corresponding to the number of video segments of the requested temporal section.
[0011] In yet another example, a method for retrieving encoded video data includes retrieving, by a client device, presentation description data describing characteristics of a video data presentation, wherein the video data comprises a number of video segments, and wherein the presentation comprises a number of video files, each of the video files corresponding to a respective one of the video segments, submitting a request specifying a temporal section of the video data to a source device, receiving, in response to the request, at least one among the video files corresponding to the number of video segments of the requested temporal section from the source device, and decoding and displaying the at least one among the video files.
[0012] In another example, an apparatus for retrieving encoded video data includes a network interface; a control unit configured to retrieve, through the network interface, presentation description data describing characteristics of a video data presentation, wherein the video data comprises a number of video segments, and wherein the presentation comprises a number of video files, each of the video files corresponding to a respective one of the video segments, to submit a request specifying a temporal section of the video data to a source device, and to receive, in response to the request, at least one of the video files corresponding to the number of video segments of the requested temporal section from the source device; a video decoder configured to decode the at least one of the video files; and a user interface comprising a display configured to display the decoded at least one of the video files.
[0013] In another example, an apparatus for retrieving encoded video data includes mechanisms for retrieving presentation description data describing characteristics of a video data presentation, wherein the video data comprises a number of video segments, and wherein the presentation comprises a number of video files, each of the video files corresponding to a respective one of the video segments; mechanisms for sending a request that specifies a temporal section of the video data to a source device; mechanisms for receiving, in response to the request, at least one of the video files corresponding to the number of video segments of the requested temporal section, from the source device; and mechanisms for decoding and displaying at least one of the video files.
[0014] In another example, a computer-readable storage medium comprises instructions that, when executed, cause a processor of a device for retrieving encoded video data to retrieve presentation description data describing characteristics of a presentation of video data, wherein the video data comprises a number of video segments, and wherein the presentation comprises a number of video files, each of the video files corresponding to a respective one of the video segments, submit a request that specifies a temporal section of the video data to a source device, receive, in response to the request, at least one of the video files corresponding to the number of video segments of the requested temporal section from the source device, cause a video decoder of the client device to decode the at least one of the video files, and cause a user interface of the client device to display the at least one decoded video file.
[0015] Details of one or more examples are set forth in the accompanying drawings and in the description below. Other features, objects and advantages will be evident from the description and drawings, and from the claims.
Brief Description of Drawings
[0016] Figure 1 is a block diagram illustrating an exemplary system in which an audio/video (A/V) source device carries audio and video data to an A/V destination device.
[0017] Figure 2 is a block diagram illustrating an exemplary arrangement of components of a multiplexer.
[0018] Figure 3 is a block diagram, which illustrates an exemplary set of program-specific information tables.
[0019] Figure 4 is a conceptual diagram illustrating alignment between Third Generation Partnership Project (3GPP) files from various presentations and corresponding video segments.
[0020] Figure 5 is a flowchart illustrating an exemplary method for transporting encoded video data from a source device to a destination device.
[0021] Figure 6 is a block diagram illustrating elements of an exemplary 3GPP file.
[0022] Figure 7 is a flowchart illustrating an exemplary method for requesting a fragment of a 3GPP file in response to a seek request for a temporal location within the 3GPP file.
Detailed Description of the Invention
[0023] The techniques in this description are generally directed to supporting streaming transport of video data using a protocol such as, for example, hypertext transfer protocol (HTTP), and the HTTP streaming application of HTTP. In general, references to HTTP may include references to HTTP streaming in this description. This description provides a media presentation description (MPD) file that signals characteristic elements of a number of video data presentations, such as, for example, where video data fragments are stored within the presentations. Each presentation may include a number of individual files, e.g., Third Generation Partnership Project (3GPP) files. In general, each presentation can include a set of individual characteristics, such as, for example, a bit rate, frame rate, resolution, interlaced or progressive scan type, encoding type (e.g., MPEG-1, MPEG-2, H.263, MPEG-4/H.264, H.265, etc.) or other features.
[0024] Each of the 3GPP files can be stored individually by a server and retrieved individually by a client, for example, using HTTP GET and partial GET requests. HTTP GET and partial GET requests are described in R. Fielding et al, "Hypertext Transfer Protocol - HTTP/1.1", Network Working Group, RFC 2616, June 1999, available at http://tools.ietf.org/html/rfc2616. According to the techniques in this description, the 3GPP files of each presentation can be aligned in such a way that they correspond to the same video section, that is, the same set of one or more scenes. In addition, a server can name corresponding 3GPP files from each presentation using a similar naming scheme. In this way, an HTTP client can easily switch presentations, for example, as network conditions change. For example, when a large amount of bandwidth is available, the client can retrieve 3GPP files from a relatively higher quality presentation, while when a smaller amount of bandwidth is available, the client can retrieve 3GPP files from a relatively lower quality presentation.
[0025] This description also provides techniques for flagging presentation characteristics and corresponding 3GPP files, summarized in an MPD file. As an example, this description provides techniques by which a server can flag characteristics such as, for example, an expected rendering capability and decoding capability of a client device for each presentation. In this way, a client device can choose between the various presentations based on the client device's decoding and rendering capabilities. As another example, this description provides techniques for signaling an average bit rate and maximum bit rate for each presentation. In this way, a client device can determine bandwidth availability and select from multiple presentations based on the determined bandwidth.
[0026] According to the techniques in this description, a server can use a naming convention that indicates which 3GPP files of each presentation correspond to the same scene. This description provides techniques for aligning the 3GPP files of each presentation so that each scene matches one of the 3GPP files in each presentation. For example, a server can name the 3GPP files of each presentation corresponding to a scene with duration from time T to time T+N using a naming convention similar to "[program]_preX_T_T+N", where T and T+N, in the naming convention, correspond to values for time T and time T+N, "[program]" corresponds to the video name, and "_preX" corresponds to an identifier of the presentation (for example, "pre2" for presentation 2). Consequently, the 3GPP files of each presentation can be aligned such that the file sizes of the 3GPP files, in the same period of time, can be used to derive the instantaneous bit rate of each presentation.
[0027] In addition, the server can signal the start time, as well as the end time and/or duration, of each of the 3GPP files for each presentation. In this way, a client can retrieve a particular 3GPP file, using an HTTP GET, based on the file name, by retrieving the start time and end time of the 3GPP file, as signaled by the server, and automatically generating the file name based on the start time and end time. In addition, the server can also signal byte ranges of each of the 3GPP files of each presentation. Consequently, the client can retrieve all or part of a 3GPP file, using a partial GET, based on the automatically generated name and a byte range of the 3GPP file to be retrieved. The client can use HTTP's HEAD method to retrieve the file size of a particular 3GPP file. In general, a HEAD request retrieves header data, without corresponding body data, for the URN or URL to which the HEAD request is directed.
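For purposes of illustration only, the following sketch shows how a client might exercise these HTTP methods; the URL, the file name and the use of the Python "requests" library are assumptions of this example, not requirements of the techniques described here.

    import requests  # assumed HTTP client; any HTTP library could be used

    # Hypothetical 3GPP file named per the convention described above.
    url = "http://www.example.com/program1_pre2_0_10"

    # HEAD request: returns only header data, e.g. Content-Length, without the file body.
    file_size = int(requests.head(url).headers["Content-Length"])

    # Partial GET: the Range header limits the response to the requested byte range.
    response = requests.get(url, headers={"Range": "bytes=0-%d" % (file_size // 2)})
    fragment = response.content  # bytes of the requested byte range of the 3GPP file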
[0028] Figure 1 is a block diagram illustrating an exemplary system 10 in which audio/video (A/V) source device 20 carries audio and video data to A/V destination device 40. The system 10 of Figure 1 may correspond to a video teleconferencing system, a server/client system, a broadcast/receiver system, gaming system or any other system in which video data is sent from a source device, such as as an A/V source device 20, to a target device, such as an A/V target device 40. In some examples, audio encoder 26 may comprise a speech encoder, also referred to as a vocoder.
[0029] A/V source device 20, in the example of Figure 1, includes audio source 22, video source 24, audio encoder 26, video encoder 28, media presentation description (MPD) management unit 30 and network interface 32. Audio source 22 may comprise, for example, a microphone that produces electrical signals representative of captured audio data to be encoded by audio encoder 26. Alternatively, audio source 22 may comprise a storage medium that stores previously recorded audio data, an audio data generator such as a computer synthesizer, or any other source of audio data. Video source 24 may include a video camera that produces video data to be encoded by video encoder 28, a storage medium encoded with previously recorded video data, a video data generation unit for computer graphics, or any other source of video data. Raw audio and video data can include analog or digital data. Analog data can be digitized before being encoded by audio encoder 26 and/or video encoder 28.
[0030] Audio source 22 can obtain audio data from a speaking participant while the speaking participant is speaking, and video source 24 can simultaneously obtain video data from the speaking participant. In other examples, audio source 22 may comprise a computer readable storage medium comprising stored audio data, and video source 24 may comprise a computer readable storage medium comprising stored video data. In this way, the techniques described in this description can be applied to live, streaming, real-time audio and video data or to pre-recorded archived audio and video data.
[0031] Audio frames corresponding to video frames are generally audio frames that contain audio data that was captured by audio source 22 contemporaneously with video data captured by video source 24 that is contained within the video frames. For example, while a speaking participant generally produces audio data by speaking, audio source 22 captures the audio data, and video source 24 captures video data from the speaking participant at the same time, that is, while audio source 22 is capturing the audio data. Hence, an audio frame can temporally correspond to one or more particular video frames. Accordingly, an audio frame corresponding to a video frame generally corresponds to a situation in which audio data and video data were captured at the same time, and for which an audio frame and a video frame respectively comprise the audio data and the video data that were captured at the same time. Audio data can also be added separately, for example soundtrack information, added sounds, music, sound effects and the like.
[0032] Audio encoder 26 can encode, in each encoded audio frame, a timestamp that represents a time at which the audio data for the encoded audio frame was recorded and, similarly, video encoder 28 can encode, in each encoded video frame, a timestamp that represents a time at which the video data for the encoded video frame was recorded. In such examples, an audio frame corresponding to a video frame may comprise an audio frame comprising a timestamp and a video frame comprising the same timestamp. A/V source device 20 can include an internal clock from which audio encoder 26 and/or video encoder 28 can generate the timestamps, or that audio source 22 and video source 24 can use to associate audio and video data, respectively, with a timestamp.
[0033] In some examples, audio source 22 can send data to audio encoder 26 corresponding to a time at which audio data was recorded, and video source 24 can send data to video encoder 28 corresponding to a time at which video data was recorded. In some examples, audio encoder 26 can encode a sequence identifier in encoded audio data to indicate a relative temporal ordering of the encoded audio data, without necessarily indicating an absolute time at which the audio data was recorded, and, similarly, video encoder 28 may also use sequence identifiers to indicate a relative temporal ordering of encoded video data. Likewise, in some examples, a sequence identifier can be mapped or otherwise correlated to a timestamp.
[0034] Audio encoder 26 and video encoder 28 provide encoded data to MPD management unit 30. In general, MPD management unit 30 stores a summarization of the encoded audio and video data in the form of MPD files corresponding to the encoded audio and video data, according to the techniques in this description. As discussed in more detail below, an MPD file describes a number of presentations, each presentation having a number of video files, for example, formed as Third Generation Partnership Project (3GPP) files. MPD management unit 30 can create the same number of 3GPP files in each presentation and can align the 3GPP files of each presentation such that similarly positioned 3GPP files correspond to the same video segment. That is, similarly positioned 3GPP files can correspond to the same temporal video fragment. MPD management unit 30 can also store data describing characteristics of the 3GPP files for each presentation, such as, for example, 3GPP file durations.
[0035] MPD management unit 30 can interact with network interface 32 to provide video data to a client, such as A/V target device 40. Network interface 32 can implement HTTP (or other network communication protocols) to allow target device 40 to request individual 3GPP files listed in an MPD file that is stored by MPD management unit 30. Network interface 32 can therefore respond to HTTP GET requests for 3GPP files, partial GET requests for individual byte ranges of 3GPP files, HEAD requests for header information for MPD and/or 3GPP files, and other such requests. Consequently, network interface 32 can deliver data to target device 40 that is indicative of characteristics of an MPD file, such as, for example, a base name for the MPD file, presentation characteristics of the MPD file and/or characteristics of the 3GPP files stored in each presentation. Data describing characteristics of a presentation of an MPD file, the MPD file itself and/or 3GPP files corresponding to the MPD file may be referred to as "presentation description data." In some examples, network interface 32 may instead comprise a network interface card (NIC) that extracts application layer data from received packets and then passes the application layer packets to MPD management unit 30. In some examples, MPD management unit 30 and network interface 32 can be functionally integrated.
[0036] In this way, a user can interact with target device 40, through a web browser application 38 executed on target device 40 in the example of Figure 1, to retrieve video data. Web browser 38 may initially retrieve a first video file, or fragments thereof, from one of the presentations stored by MPD management unit 30, then retrieve subsequent video files or fragments as the first video file is being decoded and displayed by video decoder 48 and video output 44, respectively. Target device 40 may include a user interface, which includes video output 44, e.g., in the form of a display, and audio output 42, as well as other input and/or output devices, such as, e.g., a keyboard, mouse, joystick, microphone, touch screen display, pen, light pen, or other input and/or output device. When video files include audio data, audio decoder 46 and audio output 42 can decode and present the audio data, respectively. Additionally, a user can "seek" to a particular temporal location of a video presentation. For example, the user can seek, in the sense that the user requests a particular temporal location within the video data, rather than watching the video file in its entirety from start to finish. The web browser may cause a processor or other processing unit of target device 40 to determine one of the video files that includes the temporal location of the seek, then request that video file from source device 20.
[0037] In some examples, a control unit, within target device 40, may perform the functionality of web browser 38. That is, the control unit may execute instructions to web browser 38 to submit requests to the source device 20, via the network interface 36, to select among presentations of an MPD file, and to determine the available bandwidth of the network connection 34. Instructions to the web browser 38 may be stored in a computer readable storage medium. The control unit can still form requests, for example HTTP GET requests and partial GET requests, for individual 3GPP files coming from source device 20, as described in this description. The control unit may comprise a general purpose processor and/or one or more dedicated hardware units, such as, for example, ASICs, FPGAs or other hardware or circuitry or processing units. The control unit may, in some examples, further perform the functionality of any of audio decoder 46, video decoder 48 and/or any other functionality described in relation to the target device 40.
[0038] In general, presentations of an MPD file differ by characteristics, such as, for example, expected rendering capabilities of a target device, expected decoding capabilities of a target device, and average bit rate, for the video files of the presentations. The MPD management unit 30 can signal expected rendering capabilities, expected decoding capabilities and average bitrates for presentations in MPD file presentation headers. In this way, the target device 40 can determine which of the presentations from which to retrieve video files, for example, based on the rendering capabilities of the video output 44 and/or the decoding capabilities of the video decoder 48.
[0039] The target device 40 can still determine current bandwidth availability, for example, of network connection 34, and select a presentation, based on the average bit rate for the presentation. That is, when the video output 44 and the video decoder 48 have the ability to respectively render and decode video files from more than one of the presentations of an MPD file, the target device 40 can select one of the presentations, based on current bandwidth availability. Likewise, when bandwidth availability changes, target device 40 can dynamically change between supported presentations. For example, when bandwidth becomes constrained, target device 40 can retrieve a next video file from a presentation that has relatively lower bit rate video files, while when bandwidth increases, the target device 40 can retrieve a next video file from a presentation that has relatively higher bit rate video files.
[0040] By temporally aligning the video files of each presentation, dynamic switching between presentations can be simplified for target devices, such as target device 40. That is, target device 40 can, by determining that bandwidth conditions have changed, determine a period of time during which video data has already been retrieved, and then retrieve the next video file from one of the presentations, based on the bandwidth conditions. For example, if the last video file retrieved by target device 40 ends at time T, and the next file is of duration N, target device 40 can retrieve the video file, from time T to time T +N, from any of the presentations, based on bandwidth conditions, because the video files from each of the presentations are time-aligned.
[0041] In addition, MPD management unit 30 and web browser 38 can be configured with a common naming convention for video files. In general, each video file (for example, each 3GPP file) can comprise a name based on a uniform resource locator (URL) at which the MPD file is stored, a uniform resource name (URN) of the MPD file in the URL, a presentation name, a start time and an end time. Thus, both MPD management unit 30 and web browser 38 can be configured to use a naming scheme such as, for example: "[URL]/[URN]_pre[X]_[start time]_[end time]", where [URL] is replaced with the URL of the MPD file, [URN] is replaced with the URN of the MPD file, X is replaced with the presentation number, [start time] is replaced with the start time of the 3GPP file being requested, and [end time] is replaced with the end time of the 3GPP file being requested. In other examples, the name may be based on the position of the 3GPP file within the presentation. For example, for the Mth 3GPP file, the 3GPP file name can be automatically generated as "[URL]/[URN]_pre[X]_[M]". Web browser 38 may, in some instances, submit an HTTP partial GET request for a file, for example, by specifying the file using the naming scheme above, as well as a byte range of the file. Web browser 38 can use the HTTP HEAD method to retrieve the size of a file specified, for example, using the naming scheme above.
[0042] Thus, for example, for an MPD file having the URL "www.qualcomm.com" and a URN of "program1", to retrieve the 3GPP file of presentation 3 starting at 10:02 and ending at 10:05, web browser 38 may submit an HTTP GET request for "www.qualcomm.com/program1_pre3_10:02_10:05". As a further example, if presentation 2 has a relatively higher bit rate than presentation 3, and target device 40 determines that the available bandwidth has increased, then, after retrieving the previous exemplary 3GPP file, web browser 38 may submit an HTTP GET request for "www.qualcomm.com/program1_pre2_10:05_10:08".
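For purposes of illustration only, the following sketch assembles such a request from the signaled values; the helper name build_3gpp_url, the "http://" prefix and the use of the Python "requests" library are assumptions of this example.

    import requests  # assumed HTTP client

    def build_3gpp_url(base_url, urn, presentation, start_time, end_time):
        # Follows the "[URL]/[URN]_pre[X]_[start time]_[end time]" scheme described above.
        return "%s/%s_pre%d_%s_%s" % (base_url, urn, presentation, start_time, end_time)

    # The example above: presentation 3 of "program1", from 10:02 to 10:05.
    url = build_3gpp_url("http://www.qualcomm.com", "program1", 3, "10:02", "10:05")
    video_file = requests.get(url).content  # HTTP GET request for the whole 3GPP file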
[0043] In general, each 3GPP file can be independently decodable. Each 3GPP file can include, for example, at least one intracoded picture. For example, each 3GPP file can comprise one or more groups of pictures (GOPs) or superframes, where at least one key frame for the GOPs or superframes is encoded using an intra-mode encoding. In this way, web browser 38 can retrieve a 3GPP file from any one of the presentations of an MPD file and decode the retrieved 3GPP file without reference to other 3GPP files of the same presentation. For example, when web browser 38 determines that available bandwidth has increased, web browser 38 may request a next 3GPP file from a presentation that has a relatively higher average bit rate than a current presentation, without retrieving temporally earlier 3GPP files from the presentation that has the higher average bit rate.
[0044] In this way, source device 20 can provide video data in the form of MPD files to target devices, such as target device 40. In some examples, source device 20 may include a network server. Target device 40 may comprise any device capable of retrieving data via, for example, HTTP, such as, for example, a computer or a mobile device such as a cell phone with Internet access. Source device 20 can implement the techniques in this description to transport encoded video data to target device 40 and to signal characteristics of the encoded video data. The encoded video data can be encoded using any of a variety of different standards, such as, for example, MPEG-1, MPEG-2, H.263, H.264/MPEG-4, H.265 or other coding standards.
[0045] The ITU-T H.264 standard, as an example, supports intraprediction on various block sizes, such as 16 by 16, 8 by 8 or 4 by 4 for luma components and 8x8 for chroma components, as well as interprediction on various block sizes, such as 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 and 4x4 for luma components and the corresponding scaled sizes for chroma components. In this description, "NxN" and "N by N" may be used interchangeably to refer to the pixel dimensions of the block in terms of vertical and horizontal dimensions, for example 16x16 pixels or 16 by 16 pixels. In general, a 16x16 block will have 16 pixels in a vertical direction (y = 16) and 16 pixels in a horizontal direction (x = 16). Likewise, an NxN block generally has N pixels in a vertical direction and N pixels in a horizontal direction, where N represents a non-negative integer value. Pixels in a block can be arranged in rows and columns. Also, video data blocks need not be square; for example, they can comprise NxM pixels, where N is not equal to M.
[0046] Block sizes that are smaller than 16 by 16 may be referred to as partitions of a 16 by 16 macroblock. Video blocks may comprise blocks of pixel data in the pixel domain, or blocks of transform coefficients in the transform domain, for example, following application of a transform, such as a discrete cosine transform (DCT), an integer transform, a wavelet transform, or a conceptually similar transform, to residual video block data representing pixel differences between encoded video blocks and predictive video blocks. In some cases, a video block may comprise blocks of quantized transform coefficients in the transform domain.
[0047] Smaller video blocks can provide better resolution and can be used for locations of a video frame that include high levels of detail. In general, macroblocks and the various partitions, sometimes referred to as subblocks, can be considered video blocks. In addition, a slice can be considered a plurality of video blocks, such as macroblocks and/or subblocks. Each slice can be an independently decodable unit of a video frame. Alternatively, frames themselves can be decodable units, or other parts of a frame can be defined as decodable units. The terms "coded unit" or "coding unit" can refer to any independently decodable unit of a video frame, such as an entire frame, a slice of a frame, a group of pictures (GOP), also referred to as a sequence, or another independently decodable unit defined in accordance with applicable coding techniques.
[0048] The term macroblock refers to a data structure for encoding image and/or video data according to a two-dimensional pixel array comprising 16x16 pixels. Each pixel comprises a chrominance component and a luminance component. Consequently, the macroblock can define four luminance blocks, each comprising a two-dimensional array of 8x8 pixels, two chrominance blocks, each comprising a two-dimensional array of 16x16 pixels, and a header comprising syntax information such as a coded block pattern (CBP), an encoding mode (e.g., intracoding (I) or intercoding (P or B) modes), a partition size for partitions of an intercoded block (e.g., 16x16, 16x8, 8x16, 8x8, 8x4, 4x8 or 4x4), or one or more motion vectors for an intercoded macroblock.
[0049] Video encoder 28, video decoder 48, audio encoder 26, audio decoder 46 and MPD management unit 30 may each be implemented as any of a variety of suitable encoder or decoder circuitry, as applicable, such as one or more microprocessors, digital signal processors (DSPs), application-specific integrated circuits (ASICs), field-programmable gate arrays (FPGAs), discrete logic circuitry, software, hardware, firmware or any combinations thereof. Each of video encoder 28 and video decoder 48 may be included in one or more encoders or decoders, either of which may be integrated as part of a combined video encoder/decoder (CODEC). Likewise, each of audio encoder 26 and audio decoder 46 can be included in one or more encoders or decoders, either of which can be integrated as part of a combined CODEC. An apparatus including video encoder 28, video decoder 48, audio encoder 26, audio decoder 46, MPD management unit 30 and/or hardware running web browser application 38 may comprise an integrated circuit, a microprocessor and/or a wireless communication device, such as a cell phone.
[0050] Figure 2 is a block diagram illustrating an exemplary arrangement of components of MPD management unit 30 (Figure 1). In the example of Figure 2, MPD management unit 30 includes MPD creation unit 60, video input interface 80, audio input interface 82, MPD file storage 84 and MPD output unit 70. MPD creation unit 60 includes parameter signaling unit 62 and 3GPP file management unit 64, while MPD output unit 70 includes 3GPP file retrieval unit 72 and HTTP server unit 74.
[0051] Video input interface 80 and audio input interface 82 retrieve encoded video and audio data, respectively. Video input interface 80 and audio input interface 82 can receive encoded audio and video data as the data is encoded, or can retrieve encoded audio and video data from a computer readable medium. Upon receiving encoded audio and video data, video input interface 80 and audio input interface 82 pass the encoded audio and video data to MPD creation unit 60 for assembly into an MPD file.
[0052] The MPD creation unit 60 receives the encoded audio and video data to create MPD files. The MPD creation unit 60 can receive encoded video data in a variety of different ways, for example, different sizes (resolutions) of video data, video data encoded in accordance with a variety of encoding standards, video data encoded in various frame rates and/or bit rates or other variations. The MPD creation unit 60 can receive encoded video data for each of several presentations to be stored in an MPD file.
[0053] 3GPP file management unit 64 can create individual 3GPP files for each presentation of an MPD file in such a way that the 3GPP files of each presentation are time-aligned. That is, the first 3GPP file in each presentation corresponds to the same video fragment, with the same duration, start time and end time. 3GPP files corresponding to the same video fragment, having the same start time and end time, are referred to as corresponding 3GPP files. Because presentations can have different frame rates, corresponding 3GPP files of the presentations can include different numbers of encoded images. Likewise, because encoding methodologies and bit rates may differ between presentations, corresponding 3GPP files may have different file sizes.
[0054] In some examples, the 3GPP file management unit 64 can build 3GPP files for each presentation such that each 3GPP file has the same temporal duration. In these examples, the parameter signaling unit 62 can signal the duration of all 3GPP files for an MPD file, using a value representative of the common duration. In other examples, the 3GPP file management unit 64 can build the 3GPP files such that corresponding 3GPP files between presentations have the same duration, but the 3GPP files within a presentation can have individual durations. In such examples, the parameter signaling unit 62 can signal the duration for each 3GPP file. The parameter signaling unit 62 can also signal the start time, end time and/or duration of each 3GPP file, for a presentation.
[0055] Parameter signaling unit 62 can also signal other characteristics of an MPD file, of the 3GPP files included in the MPD file, and of the presentations of the MPD file. For example, parameter signaling unit 62 can signal expected decoding capabilities of a decoder of a target device for each presentation. Decoding capabilities can include, for example, decoding methodologies, such as the encoding standard used to encode the presentation's video data, a minimum macroblock decoding rate, a minimum frame decoding rate, a frame or block buffer size, and/or other expected decoding capabilities. In some examples, parameter signaling unit 62 can signal the expected decoding capabilities using a profile indicator value (profile IDC) and a level indicator value (level IDC).
[0056] In the context of video encoding standards, a "profile IDC" value may correspond to a subset of algorithms, features or tools, and restrictions that apply to them. As defined by the H.264 standard, for example, a "profile IDC" value describes a subset of the entire bitstream syntax that is specified by the H.264 standard. A "level IDC" value describes limitations on decoder resource consumption, such as, for example, decoder memory and computation, which are related to image resolution, bit rate and macroblock (MB) processing rate.
[0057] Parameter signaling unit 62 can also signal expected rendering capabilities, for each presentation. For example, the parameter signaling unit 62 can signal an image width, image height and/or frame rate for each presentation.
[0058] MPD creation unit 60 can store created MPD files, along with the 3GPP files for each presentation and the signaled characteristics of the MPD file, presentations and each 3GPP file, to MPD file storage 84. MPD file storage 84 may comprise a computer readable storage medium, such as, for example, a hard disk, a solid-state drive, magnetic tape, optical storage media, or any other storage medium, or a combination thereof.
[0059] MPD output unit 70 can receive and respond to HTTP requests from target devices, such as, for example, target device 40. In the example of Figure 2, it is assumed that network interface 32 (Figure 1) extracts application layer data, which may include HTTP requests, from received network packets and passes the extracted application layer data to MPD management unit 30. MPD output unit 70 generally determines what data the request is requesting, retrieves the requested data from MPD file storage 84, and provides the requested data back to the requesting device, e.g., target device 40.
[0060] The HTTP server unit 74 in this example implements HTTP to receive and interpret HTTP requests such as GET, partial GET and HEAD requests. Although this description describes the HTTP example, for purposes of illustration, other network protocols can also be used with the techniques described in this description.
[0061] The HTTP server unit 74 can interpret incoming HTTP requests to determine the data requested by the requests. The request may specify, for example, a particular 3GPP file or part of a particular 3GPP file, header data describing characteristics of the MPD file, presentations from the MPD file and/or 3GPP files from a presentation, or other data from an MPD. The HTTP server unit 74 can then pass an indication of the requested data to the 3GPP file retrieval unit 72. The 3GPP file retrieval unit 72 can retrieve requested data from the MPD file store 84 and return the retrieved data to the HTTP server unit 74, which can bundle the data into one or more HTTP packets and send the packets to network interface 32. Network interface 32 can then encapsulate the HTTP packets and output the packets for, for example, the target device 40.
[0062] When parameter signaling unit 62 signals start times and durations of 3GPP files for an MPD file, target devices, such as target device 40, can determine which 3GPP files correspond to one another among the several presentations of the MPD file. For example, HTTP server unit 74 can respond to HTTP HEAD requests with data describing characteristics of an MPD file, of one or more presentations of the MPD file, and/or of one or more 3GPP files of the presentations of the MPD file. Target device 40 can also retrieve the file sizes of corresponding 3GPP files to derive instantaneous bit rates for each presentation, for example, by dividing the file sizes by the temporal length of the video segment to which the 3GPP files correspond.
[0063] Streaming bandwidth adaptation over HTTP can be implemented as follows. At a particular time T, destination device 40 may switch to the stream (e.g., presentation) with the closest bit rate that is less than a desired bit rate. The instantaneous bit rate of a presentation can be calculated by mapping time T to a current 3GPP file of a current presentation. To do this, assuming each 3GPP file has the same temporal length deltaT, the target device can calculate an identifier M for the 3GPP file as M = T/deltaT. Target device 40 can then generate a file name for the 3GPP file and retrieve the 3GPP file length as uiFileLength.
[0064] Target device 40 can also retrieve a temporal duration for the 3GPP file from the MPD file. This temporal duration can be referred to as uiDuration, which is assumed to be expressed in seconds in this example. Target device 40 can then calculate the instantaneous bit rate as bitrate = 8.0 * uiFileLength/uiDuration/1000, which results in the bitrate value having units of kbps. By checking the bitrate values for each presentation of the same scene, target device 40 can find the value that is closest to, but less than, the desired bit rate. The corresponding presentation can then be the target presentation, to which target device 40 should switch. That is, target device 40 can start retrieving 3GPP files from the target presentation.
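For purposes of illustration only, the following sketch restates this selection logic; file_size_for is a placeholder for however the client learns a 3GPP file's length (for example, via an HTTP HEAD request), and the function and parameter names are assumptions of this example.

    def pick_presentation(T, deltaT, uiDuration, presentations, desired_kbps, file_size_for):
        # Map time T to the index M of the current 3GPP file, derive each presentation's
        # instantaneous bit rate from the corresponding file size, and choose the
        # presentation whose bit rate is closest to, but less than, desired_kbps.
        M = int(T / deltaT)
        best, best_kbps = None, 0.0
        for presentation in presentations:                    # e.g. ["pre1", "pre2", "pre3"]
            uiFileLength = file_size_for(presentation, M)     # file size in bytes
            kbps = 8.0 * uiFileLength / uiDuration / 1000.0   # instantaneous bit rate, kbps
            if best_kbps < kbps < desired_kbps:
                best, best_kbps = presentation, kbps
        return best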
[0065] In addition, HTTP server unit 74 can be configured to recognize 3GPP files based on a specific naming scheme, such as "[URN]_pre[X]_[start time]_[end time]", where [URN] is replaced with the URN of the MPD file, X is replaced with the presentation number, [start time] is replaced with the start time of the 3GPP file being requested, and [end time] is replaced with the end time of the 3GPP file being requested. Thus, HTTP server unit 74 may, in response to a request identifying a 3GPP file using this naming scheme, send to 3GPP file retrieval unit 72 an indication to retrieve the 3GPP file, of the MPD file identified by [URN], from presentation _pre[X], which has a start time of [start time] and an end time of [end time]. 3GPP file retrieval unit 72 can then retrieve this requested 3GPP file. Target device 40 can automatically generate the 3GPP file name for inclusion in the request, based on the start time and end time of the 3GPP files, as well as the presentation from which to retrieve the 3GPP files.
[0066] In some examples, HTTP server unit 74 may receive an HTTP partial GET request that specifies a byte range of a file identified according to the naming scheme above. HTTP server unit 74 can provide an indication of the file byte range to 3GPP file retrieval unit 72, which can retrieve only the file data corresponding to the requested byte range and provide the retrieved data to HTTP server unit 74. HTTP server unit 74 can similarly encapsulate this data and send the data to network interface 32, which can further encapsulate the data and transmit the data over connection 34.
[0067] Although, in the example of Figure 2, MPD management unit 30 is shown as including both MPD creation unit 60 and MPD output unit 70, in other examples, separate devices can be configured to perform the functionality assigned to MPD creation unit 60 and MPD output unit 70. For example, a first device can be configured to encapsulate encoded video data in the form of an MPD file and signal parameters of the MPD file, while a second device can be configured as a network server to provide access to the MPD files created by the first device. Likewise, an encoding device, separate from source device 20, can encode raw video data and send the encoded video data to MPD management unit 30 of source device 20. In general, any of the functionality assigned to source device 20 may be included in common or separate devices and/or units of the devices. MPD file storage 84 may, in some instances, correspond to an external storage device, such as, for example, an external hard disk or an external file server.
[0068] Figure 3 is a block diagram illustrating the data stored within an exemplary MPD file 90. Parameter signaling unit 62 can signal information for MPD file 90, such as, for example, uniform resource locator (URL) value 92, which represents the URL at which the MPD file is stored (for example, "www.qualcomm.com/media"), duration value 94, which represents the temporal duration of the video data for MPD file 90, and base uniform resource name (URN) value 96, which corresponds to the name of MPD file 90, e.g., "program1". MPD file 90 includes a number of presentations similar to exemplary presentation 100A. For exemplary presentation 100A of MPD file 90, parameter signaling unit 62 may signal presentation identifier 102A, incremental URN 104A (e.g., "_preX"), which a target device may use to refer to presentation 100A, expected decoding capability values, including, for example, profile IDC value 106A and level IDC value 108A, and expected rendering capability values, including, for example, frame rate value 110A, image width value 112A and/or image height value 114A.
[0069] Parameter signaling unit 62 can also signal timing information for presentations, such as presentation 100A. Timing information attributes 116A can include, for example, a number of entries in the presentations (which must be the same for each presentation), durations of the 3GPP files corresponding to 3GPP file identifiers 118A, and a number of 3GPP files that have the same duration. In some examples, parameter signaling unit 62 may signal this data for only one presentation (e.g., presentation 100A), although the data can be assumed to be the same for each of the other presentations. The following pseudocode can describe a part of a data structure that can be used to signal timing characteristics of a presentation:
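(The structure below is a rough sketch, reconstructed only from the field names discussed in paragraph [0070]; the layout and example values are hypothetical and are not the original pseudocode.)

    timing_info_attributes = {
        "number_entry": 2,  # number of continuous groups of 3GPP files in the presentation
        "entries": [
            # deltaT: duration, in seconds, of the 3GPP files in the i-th continuous group
            # numFileWithSameDuration: number of continuous 3GPP files in the i-th group
            #   (0 indicates that all 3GPP files in the presentation have duration deltaT)
            {"deltaT": 10, "numFileWithSameDuration": 5},
            {"deltaT": 8, "numFileWithSameDuration": 3},
        ],
    }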

[0070] In the exemplary pseudocode, "number_entry" represents the number of continuous groups of 3GPP files for the same presentation. Parameter signaling unit 62 can set the value of number_entry to 1 when all durations of the movie fragments are the same. The value "deltaT" represents the duration of the 3GPP files in the i-th entry of the continuous group of 3GPP files. The value "numFileWithSameDuration" represents the number of continuous 3GPP files in the i-th entry. Parameter signaling unit 62 can set the value of "numFileWithSameDuration" equal to 0 to indicate that all 3GPP files in the presentation have the same duration deltaT. For examples that correspond to live streaming, parameter signaling unit 62 can set the value of "number_entry" to 1 to indicate that all 3GPP files have the same duration.
[0071] Presentation 100A also includes a number of 3GPP file identifiers 118, which correspond to 3GPP files built by the 3GPP file management unit 64. Each of the presentations 100, of MPD file 90, can include the same number of 3GPP files, which can be time-aligned.
[0072] Using this signaled data, target device 40 can automatically generate 3GPP file names to submit HTTP GET and partial GET requests. That is, target device 40 can automatically generate the URN for 3GPP files. For example, assuming that base URN 96 of MPD file 90 is "program1", and that presentation identifier 102A of presentation 100A is "_pre2", then the common part of the URN of the 3GPP files corresponding to 3GPP file identifiers 118A is "program1_pre2". For the Mth of the 3GPP file identifiers 118A in presentation 100A, in this example, target device 40 may submit an HTTP GET request for "program1_pre2_M". For example, to retrieve the 45th 3GPP file, target device 40 might submit an HTTP GET request for "program1_pre2_45.3gp". Alternatively, target device 40 may submit an HTTP GET request for "program1_pre2_Mstart_Mend", where Mstart corresponds to the start time of the Mth of the 3GPP files corresponding to 3GPP file identifiers 118A, and Mend corresponds to the end time of the Mth of the 3GPP files corresponding to 3GPP file identifiers 118A.
[0073] An HTTP client, such as target device 40, may also seek to a time T of a presentation, such as presentation 100A. To retrieve the Mth of the 3GPP files corresponding to 3GPP file identifiers 118A that corresponds to the seek time T, target device 40 can calculate M as M = T / deltaT, where deltaT can be signaled within timing information attributes 116A, as described above, assuming that each of the 3GPP files has the same duration. On the other hand, if the 3GPP files do not have the same duration, target device 40 can retrieve the durations of each of the 3GPP files to determine which of the 3GPP files corresponding to 3GPP file identifiers 118A to retrieve. After calculating M, target device 40 can retrieve the Mth of the 3GPP files corresponding to 3GPP file identifiers 118A.
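For purposes of illustration only, the following sketch shows one way a client might perform this seek computation; the function name and the handling of per-file durations are assumptions of this example.

    def file_index_for_seek(T, deltaT=None, durations=None):
        # Equal-duration case described above: M = T / deltaT.
        if deltaT is not None:
            return int(T / deltaT)
        # Otherwise, walk the per-file durations signaled in the MPD file until time T is reached.
        elapsed = 0.0
        for m, duration in enumerate(durations):
            elapsed += duration
            if T < elapsed:
                return m
        return len(durations) - 1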
[0074] When the 3GPP files of each of presentations 100 are temporally aligned, target device 40 can substitute the identifier of any one of presentations 100 for "_pre2" in the example above to retrieve 3GPP files from any of the presentations. As an example, suppose that MPD file 90 has five presentations, and that target device 40 is capable of decoding and rendering any one of presentations 2, 3, or 4. Assume further that presentation 2 has a relatively low quality (and therefore a low average bit rate), presentation 3 has a higher quality and a higher average bit rate, and presentation 4 has a still higher quality and an even higher average bit rate. Initially, target device 40 can determine that the available bandwidth is relatively low, such that it can retrieve 3GPP file "1" from presentation 2, for example, using an HTTP GET request for "program1_pre2_1.3gp". Target device 40 can then determine that the available bandwidth has increased and thus retrieve the next 3GPP file using an HTTP GET request for "program1_pre3_2.3gp". Target device 40 can then determine that the available bandwidth has increased further and thus retrieve the next 3GPP file using an HTTP GET request for "program1_pre4_3.3gp".
[0075] Figure 4 is a conceptual diagram illustrating alignment between 3GPP files 138, 144 of multiple presentations 134, 140 and video segments 130 of video 146. Video 146 includes video segments 130A-130N (video segments 130). The example of Figure 4 illustrates presentations 134 and 140, and may include additional presentations, as indicated by the ellipses between presentations 134 and 140. Presentation 134 includes header data 136 and 3GPP files 138A-138N (3GPP files 138). Presentation 140 includes header data 142 and 3GPP files 144A-144N (3GPP files 144).
[0076] Although 3GPP files 138 may be different in size from 3GPP files 144, 3GPP files 138 and 3GPP files 144 are temporally aligned with video segments 130 of video 146. In the example of Figure 4, 3GPP file 138A and 3GPP file 144A correspond to video segment 130A, 3GPP file 138B and 3GPP file 144B correspond to video segment 130B, 3GPP file 138C and 3GPP file 144C correspond to video segment 130C, and 3GPP file 138N and 3GPP file 144N correspond to video segment 130N. That is, 3GPP file 138A and 3GPP file 144A, for example, include video data that, although potentially encoded and/or rendered differently, generally correspond to the same scenes as video segment 130A.
[0077] Header data 136, 142 may generally include data descriptive of presentations 134, 140, respectively. Header data 136, 142 may include data similar to presentation identifier 102A, incremental URN 104A, profile IDC value 106A, level IDC value 108A, frame rate value 110A, image width value 112A, image height value 114A and timing information attributes 116A of Figure 3. An MPD file describing 3GPP files 138, 144 of presentations 134-140 may also include header data (not shown) describing characteristics of the MPD file, presentations 134, 140, and 3GPP files 138, 144.
[0078] In this way, target device 40 (Figure 1) can retrieve header data 136, 142 to determine whether target device 40 is capable of decoding and displaying the video data of 3GPP files 138 and/or 3GPP files 144. Assuming that target device 40 is capable of decoding and processing data from both 3GPP files 138 and 3GPP files 144, target device 40 can select between presentation 134 and presentation 140 based on bandwidth availability. For example, assuming that presentation 134 has a lower average bit rate than presentation 140, target device 40 may initially retrieve 3GPP file 138A from presentation 134 when bandwidth availability is relatively low. Assuming then that target device 40 determines that bandwidth availability has increased, target device 40 can then retrieve 3GPP file 144B. Because 3GPP files 138A and 144B are time-aligned with video segments 130, target device 40 can decode and process the encoded video data of 3GPP files 138A and 144B seamlessly, such that a user of target device 40 is able to see video segment 130A, followed immediately by video segment 130B, albeit with potentially varying qualities.
[0079] Figure 5 is a flowchart illustrating an exemplary method for transporting encoded video data from a source device to a destination device. For purposes of example, the method of Figure 5 is explained in relation to source device 20 and destination device 40, although it should be understood that other devices can be configured to perform the method of Figure 5.
[0080] In the example of Figure 5, source device 20 receives encoded video data corresponding to several video segments of a video, and encapsulates the encoded video data in an MPD file (180). For example, source device 20 can determine encoded video frames from a variety of different presentations that correspond to a common video segment, as shown in Figure 4, and encapsulate the video frames in one or more respective 3GPP files of the several presentations. Source device 20 can also signal characteristics of the MPD file, the presentations, and the 3GPP files in header portions of the MPD file, the presentations, and/or the 3GPP files (182). The characteristics may correspond to the exemplary characteristics illustrated in Figure 3.
[0081] A user of target device 40 may initially retrieve the MPD file from a link on a web page, or from a web page with embedded video, for example, using web browser 38 (Figure 1). Target device 40 may request characteristics of the MPD file after the user has asked to view the video (184). For example, target device 40 may issue an HTTP HEAD request to source device 20 for a given MPD file. In response, source device 20 may provide data indicative of characteristics of the MPD file (186). This data may indicate, for example, the number of presentations for the MPD file, the decoding and rendering capabilities that target device 40 is expected to have in order to decode and render the 3GPP files of each respective presentation, average bit rates for each presentation, a duration of the 3GPP files of the presentations (when each of the 3GPP files has the same duration), durations of each of the 3GPP files for one or more of the presentations (when the 3GPP files may have different durations within a presentation but are temporally aligned between different presentations), incremental uniform resource names for each presentation, or other characteristics of the MPD file.
[0082] Target device 40 can analyze the expected decoding and rendering capabilities to determine which of the presentations can be decoded and rendered by target device 40 (188). For example, target device 40 can determine whether video decoder 48 satisfies a profile IDC value and a level IDC value indicated by the expected decoding capabilities in the received MPD file characteristics. Target device 40 can also determine whether video output 44 is capable of displaying video data at the frame rate indicated by the expected rendering capability value, and whether the size of video output 44 matches the image height and/or image width values of the expected rendering capability value. In some examples, video decoder 48 and/or video output 44 may upsample or downsample decoded images in order to fit them properly within the size of video output 44. Likewise, video decoder 48 and/or video output 44 may interpolate or decimate (or skip) frames of decoded video data to match a refresh rate of video output 44. Target device 40 may record indications of which presentations of the MPD file can be decoded and rendered in a local computer-readable storage medium, for example, random access memory (RAM) of target device 40.
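As a rough sketch of such a capability check, using illustrative attribute names rather than the signaled syntax, the comparison might be expressed as:

    def can_decode_and_render(presentation, decoder, display):
        # 'presentation', 'decoder' and 'display' are illustrative objects whose
        # attributes mirror the characteristics described above.
        decodable = (decoder.supports_profile(presentation.profile_idc)
                     and decoder.level_idc >= presentation.level_idc)
        renderable = (display.max_frame_rate >= presentation.frame_rate
                      and display.width >= presentation.image_width
                      and display.height >= presentation.image_height)
        # Upsampling/downsampling or frame decimation, as noted above, may relax
        # the rendering portion of this test.
        return decodable and renderable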
[0083] Destination device 40 can then determine a relative amount of bandwidth of a network between itself and source device 20 (190). Target device 40 can generally use any known techniques for estimating available bandwidth to determine how much bandwidth is available. For example, target device 40 may additionally or alternatively estimate the round-trip delay (for example, by issuing an Internet Control Message Protocol (ICMP) ping request to source device 20), average packet loss or packet corruption (for example, by analyzing lost or corrupted packets according to Transmission Control Protocol (TCP) statistics), or other network performance metrics.
[0084] Target device 40 can then select one of the presentations from which to start retrieving 3GPP files (192). Target device 40 can select the one among the presentations for which video decoder 48 meets the expected decoding capabilities and for which video output 44 meets the expected rendering capabilities. When target device 40 is capable of decoding and rendering encoded video data from more than one presentation, target device 40 can select among such potential presentations based on the determined amount of bandwidth, by comparing the average bit rates of the presentations with one another. Target device 40 may be configured with a function that positively relates average bit rate to available bandwidth, such that target device 40 selects a presentation that has a relatively lower average bit rate when available bandwidth is low, but selects a presentation that has a relatively higher average bit rate when available bandwidth is high.
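One possible form of such a selection function, sketched with illustrative attribute names and an assumed safety margin, is:

    def select_presentation(candidates, available_bps, headroom=0.8):
        # 'candidates' are presentations the device can decode and render;
        # 'headroom' is an illustrative safety margin, not a signaled value.
        budget = available_bps * headroom
        usable = [p for p in candidates if p.average_bitrate <= budget]
        if not usable:
            # Fall back to the lowest average bit rate when bandwidth is scarce.
            return min(candidates, key=lambda p: p.average_bitrate)
        return max(usable, key=lambda p: p.average_bitrate)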
[0085] After selecting a presentation, target device 40 may request a 3GPP file from the presentation (194). Target device 40 may select the first 3GPP file, or a 3GPP file that includes a seek time (that is, a temporal location corresponding to a position at which a user has requested to seek within the video data). To request the 3GPP file, target device 40 can construct an HTTP GET request that specifies a URL of source device 20, a URN of the MPD file, a presentation identifier, and a 3GPP file identifier. The 3GPP file identifier may correspond to a numerical identifier of the 3GPP file, or include at least one of a start time and/or an end time. In some examples, target device 40 may construct a partial GET request, for example, when the 3GPP file that includes the seek time is relatively long (for example, close to 60 seconds).
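A sketch of how such a request might be assembled is given below; the URL layout is an assumption for illustration, and the optional byte range turns the request into a partial GET:

    import urllib.request

    def request_3gpp_file(base_url, mpd_urn, presentation_id, file_id, byte_range=None):
        # Assumed URL layout: <base_url><mpd_urn>_<presentation_id>_<file_id>.3gp
        url = "%s%s_%s_%s.3gp" % (base_url, mpd_urn, presentation_id, file_id)
        request = urllib.request.Request(url)
        if byte_range is not None:
            start, end = byte_range
            request.add_header("Range", "bytes=%d-%d" % (start, end))  # partial GET
        with urllib.request.urlopen(request) as response:
            return response.read()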
[0086] After receiving the HTTP GET or partial GET request, source device 20 can retrieve and send the requested 3GPP file (or the requested part of the 3GPP file) (196). Source device 20 can send the 3GPP file in one or more HTTP packets to destination device 40. Each of the HTTP packets can be further encapsulated, for example, according to TCP/IP. After receiving and reassembling the requested 3GPP file, target device 40 can decode and display the 3GPP file (198). That is, web browser 38 of target device 40 can send the 3GPP file to video decoder 48 for decoding, which can send the decoded video data to video output 44 for display.
[0087] Target device 40 can then determine whether the decoded and displayed video file was the last 3GPP file of the video (200). Target device 40 can determine that the 3GPP file was the last one when the end of the video is reached, or when a user chooses to stop watching the video. If the decoded and displayed video file was not the last video file ("NO" branch of 200), target device 40 may re-evaluate the available bandwidth (190), select a presentation based on the newly determined amount of bandwidth (192), and request a next 3GPP file from the selected presentation (194). On the other hand, if the decoded and displayed video file was the last video file ("YES" branch of 200), the method may end.
[0088] Figure 6 is a block diagram illustrating elements of an exemplary 3GPP file 220. Destination device 40 can use data of 3GPP file 220 to seek to a requested time within 3GPP file 220. In general, 3GPP files can include video data corresponding to any period of time, for example, between two seconds and sixty seconds, or even longer or shorter. When the time period for a 3GPP file is relatively short (for example, about two seconds), target device 40 can be configured to retrieve the entire 3GPP file that includes a seek time, that is, a time at which to begin displaying the video data, as requested by, for example, a user. On the other hand, when the time period for a 3GPP file is longer (for example, closer to 60 seconds), target device 40 can be configured to retrieve, for decoding and display, a portion of the 3GPP file that is close to the seek time, for example, using an HTTP partial GET request.
[0089] 3GPP file 220 includes movie box (MOOV) 222 and movie fragment random access box (MFRA) 230. MOOV box 222 generally includes encoded video data, while MFRA box 230 includes descriptive data to assist with random access of data within MOOV box 222. In the example of Figure 6, MOOV box 222 includes header data 224 for the entire file and possibly video fragments 226A-226C (video fragments 226), while MFRA box 230 includes track fragment random access (TFRA) box 232, which includes fragment flags 234, and movie fragment random access offset (MFRO) box 236, which includes MFRA size value 238.
[0090] MFRA size value 238 describes the length of MFRA box 230, in bytes. In the example of Figure 6, 3GPP file 220 is N bytes long. Target device 40 may submit an HTTP HEAD request for 3GPP file 220 to determine the length of 3GPP file 220, that is, the value of N in this example. In general, MFRO box 236 occupies the last four bytes of 3GPP file 220. Consequently, to determine the length of MFRA box 230, client devices such as target device 40 can retrieve the last four bytes of 3GPP file 220, for example, using an HTTP partial GET request that specifies a byte range from [N - 4] to N. Because MFRO box 236 includes MFRA size value 238, target device 40 can determine the length of MFRA box 230 after retrieving MFRO box 236.
[0091] After determining the length of MFRA box 230 using MFRA size value 238, target device 40 can retrieve the remaining part of MFRA box 230. For example, target device 40 can issue an HTTP partial GET request for 3GPP file 220 that specifies a byte range from [N - MFRA size] to [N - 4]. As shown in Figure 6, this part includes TFRA box 232, which includes fragment flags 234. Fragment flags 234 can specify, for example, temporal locations of video fragments 226.
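Taken together, the two retrievals described above might be sketched as follows; the byte arithmetic mirrors the description, and the helper names are illustrative:

    import struct
    import urllib.request

    def range_get(url, first_byte, last_byte):
        # HTTP partial GET for an inclusive byte range.
        request = urllib.request.Request(url)
        request.add_header("Range", "bytes=%d-%d" % (first_byte, last_byte))
        with urllib.request.urlopen(request) as response:
            return response.read()

    def fetch_mfra(url):
        # 1. Learn the file length N with an HTTP HEAD request.
        head = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(head) as response:
            n = int(response.headers["Content-Length"])
        # 2. The last four bytes carry the MFRA size value of the MFRO box.
        mfra_size = struct.unpack(">I", range_get(url, n - 4, n - 1))[0]
        # 3. Retrieve the rest of the MFRA box (its last four bytes were fetched in step 2).
        return range_get(url, n - mfra_size, n - 5)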
[0092] Destination device 40 may use fragment flags 234 to determine which of video fragments 226 to retrieve in order to satisfy a seek request. That is, target device 40 can determine which of video fragments 226 includes the time specified in the seek request. Target device 40 may retrieve header data 224 to determine the byte ranges of each of video fragments 226. After determining, based on fragment flags 234, which of video fragments 226 includes the time specified in the seek request, destination device 40 can retrieve that one of video fragments 226, as well as each of the subsequent video fragments 226.
[0093] In some examples, target device 40 may submit a first partial GET request for the one of video fragments 226 that includes the time specified in the seek request, begin decoding and displaying that video fragment after it is received, and then submit one or more additional partial GET requests to retrieve the subsequent video fragments 226. In other examples, target device 40 may submit a single partial GET request to retrieve the one of video fragments 226 that includes the time specified in the seek request and each of the subsequent video fragments 226, for example, by specifying a byte range from the beginning of that one of video fragments 226 through [N - MFRA size].
[0094] Figure 7 is a flowchart illustrating an exemplary method for requesting a fragment of a 3GPP file in response to a seek request for a time within the 3GPP file. Initially, target device 40 may receive a request to seek to a particular time within a video (250), for example, via web browser 38, from a user. For example, the user may select a position on a scroll bar indicative of temporal locations of the video data to request a seek to a particular temporal location.
[0095] In response, target device 40 may determine a 3GPP file, from a presentation of an MPD file, that includes the seek time (252). That is, target device 40 can determine the one of the 3GPP files of the presentation that has a start time less than the seek time and an end time greater than the seek time. For purposes of illustration, the method of Figure 7 is discussed in relation to 3GPP file 220 of Figure 6, which may correspond to any one of the 3GPP files corresponding to 3GPP file identifiers 118, 3GPP files 138, or 3GPP files 144. Assume that 3GPP file 220 has a start time less than the seek time and an end time greater than the seek time. Target device 40 can identify 3GPP file 220 based on timing information attributes for 3GPP file 220 stored within a header portion of the presentation that includes 3GPP file 220. The difference between the seek time and the start time of 3GPP file 220 may be referred to as a time offset.
[0096] Destination device 40 can then determine a length of 3GPP file 220, for example, by issuing to source device 20 an HTTP HEAD request that specifies 3GPP file 220. After determining the length of 3GPP file 220 in bytes (for example, N bytes), target device 40 can issue an HTTP partial GET request that specifies 3GPP file 220 and a byte range from [N - 4] to N, to retrieve MFRO box 236 of 3GPP file 220 (254).
[0097] As illustrated in the example of Figure 6, MFRO box 236 includes MFRA size value 238. Thus, after receiving MFRO box 236, destination device 40 can use MFRA size value 238 to retrieve the rest of MFRA box 230 (256). That is, target device 40 can issue an HTTP partial GET request for the byte range [N - MFRA size] to [N - 4] of 3GPP file 220. In this way, target device 40 can retrieve the remaining MFRA data based on the MFRO data. Destination device 40 may also retrieve MOOV header data, for example, header data 224, of MOOV box 222 (258).
[0098] Destination device 40 can use header data 224 and fragment flags 234 of MFRA box 230 to determine the one of video fragments 226 that has a start time closest to the seek time without exceeding the seek time (260). Target device 40 may then issue one or more HTTP partial GET requests to retrieve that one of video fragments 226 and each of the subsequent video fragments 226 of 3GPP file 220 (262). That is, using indications of header data 224 and fragment flags 234, destination device 40 can determine the starting byte of the one of video fragments 226 that has a start time closest to the seek time without exceeding it. Target device 40 can then construct an HTTP partial GET request that specifies this starting byte and, in various examples, either the last byte of that one of video fragments 226 or the end of MOOV box 222.
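A compact sketch of this last step, assuming the fragment start times and starting bytes have already been recovered from header data 224 and fragment flags 234, might be:

    def seek_range(fragments, seek_time, moov_end_byte):
        # 'fragments' is an assumed list of (start_time, start_byte) pairs in
        # presentation order; 'moov_end_byte' is the last byte of MOOV box 222.
        chosen_byte = fragments[0][1]
        for start_time, start_byte in fragments:
            if start_time <= seek_time:
                chosen_byte = start_byte   # closest start time not exceeding the seek time
            else:
                break
        return "bytes=%d-%d" % (chosen_byte, moov_end_byte)   # Range header for the partial GET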
[0099] Although the method of Figure 7 is described with respect to using data of MFRA box 230, target device 40 can use other data to perform a similar technique to extract video fragments 226 from 3GPP file 220. For example, target device 40 can determine which of video fragments 226 to extract based on an item location box (ILOC) of a 3GPP file. A 3GPP file can be constructed such that the first four bytes, for example, include an item location offset box (ILOO), followed immediately by the ILOC box. The ILOO box can specify the length of the ILOC box. The ILOO box can be constructed according to the following exemplary pseudocode:
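A sketch of such a box definition, written in the style of the box syntax used for 3GPP files and based only on the fields described in the following paragraph (the box type name 'iloo' and the base class are assumptions), is:

    aligned(8) class ItemLocationBoxOffset extends Box('iloo') {
        unsigned int(32) size;   // size of the ILOC box, including these four bytes
    }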

[0100] In the exemplary pseudocode, ItemLocationBoxOffset describes the name of a new class for the ILOO box. The example specifies a 32-bit integer value, "size", which is indicative of the size of the ILOC box. The ILOC box size value specified by the ILOO box can include the four bytes of the ILOO box.
[0101] The ILOC box can specify a timing information box, which indicates timing information of the video fragments included in the 3GPP file, for example, the start and end times of each fragment. The timing information can be signaled in an automatic manner, for example, to save bits. The ILOC box can also include other data descriptive of the MOOV box of the 3GPP file, for example, data similar to that stored by header data 224 of Figure 6. The ILOC and ILOO boxes can be used to indicate timing information for movie fragments. Thus, target device 40 can retrieve the video fragments to satisfy a seek request by constructing one or more HTTP partial GET requests based on the data of the ILOC and ILOO boxes.
[0102] In particular, target device 40 can first retrieve the first four bytes of the 3GPP file, which correspond to the size value for the ILOC box. That is, target device 40 may first issue an HTTP partial GET request for bytes 0 to 4 of the 3GPP file to retrieve the ILOO box. Using the ILOC box size specified by the ILOO box, target device 40 can then retrieve the ILOC box, for example, by issuing an HTTP partial GET request that specifies bytes 4 to [ILOC size].
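These two retrievals might be sketched as follows, with the byte ranges taken from the description above and the helper names illustrative:

    import struct
    import urllib.request

    def range_get(url, first_byte, last_byte):
        # HTTP partial GET for an inclusive byte range.
        request = urllib.request.Request(url)
        request.add_header("Range", "bytes=%d-%d" % (first_byte, last_byte))
        with urllib.request.urlopen(request) as response:
            return response.read()

    def fetch_iloc(url):
        # Bytes 0-3 form the ILOO box; its 32-bit value gives the ILOC box size,
        # which, per the description above, counts the four ILOO bytes as well.
        iloc_size = struct.unpack(">I", range_get(url, 0, 3))[0]
        # Retrieve the ILOC box that immediately follows the ILOO box.
        return range_get(url, 4, iloc_size - 1)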
[0103] The ILOC box can specify the position and length of a timing information box (also referred to as an ILOC box string) that indicates the byte ranges and temporal locations of each video fragment, for example, start time, end time, starting byte, and ending byte. Thus, target device 40 can retrieve the timing information box based on data of the ILOC box. Target device 40 can then determine which of the video fragments includes a start time less than the seek time and an end time greater than the seek time, and issue one or more HTTP partial GET requests to retrieve that video fragment and the subsequent video fragments of the 3GPP file.
[0104] The timing information box can be implemented according to the following exemplary pseudocode:
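Based only on the field names discussed in the following paragraph, such a box might be sketched as follows; the box type name, base class, and field widths are assumptions:

    aligned(8) class TimingInformationBox extends Box('tmif') {
        unsigned int(32) number_entry;
        for (i = 0; i < number_entry; i++) {
            unsigned int(32) deltaTFragment;
            unsigned int(32) numContinueFragWithSameDuration;
        }
    }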

[0105] In the exemplary pseudocode, the value of "number_entry" describes the number of continuous groups of movie fragments of the 3GPP file. number_entry can be set to a value of 1 to indicate that all movie fragment durations are the same. The value of "deltaTFragment" generally describes the duration of the fragments of the i-th entry of the continuous group of movie fragments in the 3GPP file. The value of "numContinueFragWithSameDuration" describes the number of continuous movie fragments in the i-th entry. When the value of "numContinueFragWithSameDuration" equals 0, it indicates that all 3GPP files of the presentation have the same duration deltaT.
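As a small worked sketch, the entries could be expanded into per-fragment start times as follows; the list structure is assumed, and the special case of numContinueFragWithSameDuration equal to 0 is left to the caller:

    def fragment_start_times(entries):
        # 'entries' is an assumed list of (deltaTFragment,
        # numContinueFragWithSameDuration) pairs read from the timing information box.
        starts, t = [], 0
        for duration, count in entries:
            for _ in range(count):
                starts.append(t)    # start time of the next movie fragment
                t += duration
        return starts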
[0106] In one or more examples, the functions described can be implemented in hardware, software, firmware, or any combination thereof. If implemented in software, the functions can be stored on, or transmitted via, a computer-readable medium as one or more instructions or code. Computer-readable media may include computer-readable storage media, such as data storage media, or communication media, including any medium that facilitates the transfer of a computer program from one place to another. Data storage media can be any available media that can be accessed by one or more computers or one or more processors to retrieve instructions, code, and/or data structures for implementing the techniques described in this description. The phrase "computer-readable storage media" is intended to refer to tangible, non-transitory, computer-readable storage media, which may correspond to an article of manufacture. By way of example, and not limitation, such computer-readable media may comprise RAM, ROM, EEPROM, CD-ROM or other optical disc storage, magnetic disk storage or other magnetic storage devices, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer. Furthermore, any connection is properly termed a computer-readable medium. For example, if the software is transmitted from a website, server, or other remote source using coaxial cable, fiber optic cable, twisted pair, digital subscriber line (DSL), or wireless technologies such as infrared, radio, and microwave, then coaxial cable, fiber optic cable, twisted pair, DSL, or wireless technologies such as infrared, radio, and microwave are included in the definition of medium. Disk and disc, as used herein, include compact disc (CD), laser disc, optical disc, digital versatile disc (DVD), floppy disk, and Blu-ray disc, where disks typically reproduce data magnetically, while discs reproduce data optically with lasers. Combinations of the above should also be included within the scope of computer-readable media.
[0107] The code can be executed by one or more processors, such as one or more digital signal processors (DSPs), general purpose microprocessors, application specific integrated circuits (ASICs), field programmable gate arrays (FPGAs), or other equivalent integrated or discrete logic circuitry. Accordingly, the term "processor", as used herein, may refer to any of the foregoing structures or any other structure suitable for implementing the techniques described herein. In addition, in some aspects, the functionality described here may be provided within dedicated hardware and/or software modules configured for encoding and decoding, or incorporated into a combined codec. Also, the techniques could be fully implemented in one or more circuits or logic elements.
[0108] The techniques of this description can be implemented in a wide variety of devices or apparatuses, including a wireless handset, an integrated circuit (IC), or a set of ICs (for example, a chipset). Various components, modules, or units are described in this description to emphasize functional aspects of devices configured to perform the disclosed techniques, but they do not necessarily require realization by different hardware units. Instead, as described above, various units may be combined in a codec hardware unit or provided by a collection of interoperable hardware units, including one or more processors as described above, together with suitable firmware and/or software.
[0109] Several examples have been described. These and other examples are within the scope of the claims that follow.
Claims (4)
1. Method for providing information for at least first and second presentations of encoded video data, the method characterized in that it comprises: receiving video files for each of the first and second presentations of a media presentation description, MPD, file, in which the files of the first and second presentations are temporally aligned, such that, for the same scenes, the files of the first presentation correspond to the files of the second presentation and have the same duration, start time and end time; providing an MPD with timing information comprising a value indicative of a number of entries and a corresponding number of entries, each with a file duration and a number of continuous files with that file duration, to a client device; signaling in the MPD a first rendering capability, a first decoding capability and a first average bit rate for the first presentation to the client device; signaling in the MPD a second rendering capability, a second decoding capability and a second average bit rate for the second presentation to the client device; providing in the MPD a uniform resource locator for each of the files of the first and second presentations, the uniform resource locator including a replaceable identifier of any of the presentations; receiving an HTTP request specifying the uniform resource locator of one of the files of the first presentation or the second presentation from the client device, wherein the request comprises a partial hypertext transfer protocol (HTTP) GET request specifying a byte range of one of the video files of the specified presentation; and sending the requested file to the client device using HTTP.
2. Method according to claim 1, characterized in that it further comprises: determining the start times of the files; and providing signaling of the start times of the files to the client device.
3. Method for receiving encoded video data, the method characterized in that it comprises: receiving a media presentation description, MPD, file with information indicative of a temporal duration of video files for each of first and second presentations of the MPD file, in which the files of the first and second presentations are temporally aligned, such that, for the same scenes, the files of the first presentation correspond to the files of the second presentation and have the same duration, start time and end time, the information including timing information comprising a value indicative of a number of entries and a corresponding number of entries, each with a file duration and a number of continuous files with that duration; receiving in the MPD a first rendering capability, a first decoding capability and a first average bit rate for the first presentation at the client device; receiving in the MPD a second rendering capability, a second decoding capability and a second average bit rate for the second presentation at the client device; receiving in the MPD an indication of a uniform resource locator, the uniform resource locator including a replaceable identifier of any of the presentations; and generating, based on the timing information and the uniform resource locator, a request for one of the files of the presentations, wherein the request comprises a partial hypertext transfer protocol (HTTP) GET request specifying a byte range of one of the video files of the specified presentation.
4. Method according to claim 3, characterized in that receiving the information comprises receiving the information from a server device, the method further comprising sending the request to the server device.